HIGHLY SCALABLE SYSTEM AND METHOD OF REGULATING
INTERNET TRAFFIC TO SERVER FARM TO SUPPORT (MIN,MAX) 
BANDWIDTH USAGE-BASED SERVICE LEVEL AGREEMENTS 

BACKGROUND OF THE INVENTION 

1. FIELD OF THE INVENTION 

The present invention generally relates to the global Internet and 
Internet World Wide Web (WWW) sites of various owners that are hosted by a 
service provider using a group of servers that are intended to meet established service 
levels. More particularly, this invention relates to a highly scalable system and 
method for supporting (min,max) based service level agreements on outbound
bandwidth usage for a plurality of customers by regulating inbound traffic coming to a 
server farm where the server farm is comprised of numerous servers. 

2. DESCRIPTION OF THE PRIOR ART 

The Internet is the world's largest network and has become essential to
businesses as well as to consumers. Many businesses have started outsourcing their
e-business and e-commerce Web sites to service providers instead of running their
Web sites on their own server(s) and managing them by themselves. Such a service
provider needs to install a collection of servers (called a Web Server Farm (WSF) or a
Universal Server Farm (USF)), which can be used by many different businesses to
support their e-commerce and e-business. These business customers (the service
provider's "customers") have different "capacity" requirements for their Web sites.
Web Server Farms are connected to the Internet via high speed communications links
such as T3 and OCx links. These links are shared by all of the Web sites and all of the
users accessing the services hosted by the Web Server Farm. When businesses
(hereafter referred to as customers of a server farm, or customers) outsource their
e-commerce and/or e-business to a service provider, they typically need some
assurance as to the services they are getting from the service provider for their sites.
Once the service provider has made a commitment to a customer to provide a certain
level of service (called a Service Level Agreement (SLA)), the provider needs to
maintain that level of service to that customer.

A general SLA on communications link bandwidth usage for a
customer can be denoted by a pair of bandwidth constraints: the minimum guaranteed
bandwidth, Bmin(i,j), and the maximum bandwidth bound, Bmax(i,j), for each i-th
customer's j-th type or class traffic. The minimum (or min) bandwidth Bmin(i,j) is a
guaranteed bandwidth that the i-th customer's j-th type traffic will receive regardless of
the bandwidth usage by other customers. The maximum (or max) bandwidth
Bmax(i,j) is an upper bound on the bandwidth that the i-th customer's j-th type traffic
may receive provided that some unused bandwidth is available. Therefore, the range
between Bmin(i,j) and Bmax(i,j) represents the bandwidth provided on an "available"
or "best-effort" basis, and it is not necessarily guaranteed that the customer will obtain
this bandwidth. In general, the unit cost to use the bandwidth up to Bmin(i,j) is less
than or equal to the unit cost to use the bandwidth between Bmin(i,j) and Bmax(i,j).
Such a unit cost assigned to one customer may differ from those assigned to other
customers.
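
As an illustration only, the (min,max) pair for one (i,j) class can be modeled as a small record that splits a bandwidth demand into its guaranteed and best-effort portions. The class name `BandwidthSLA`, the method `split`, and the numeric values are hypothetical and do not appear in this specification; the units are arbitrary.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BandwidthSLA:
    """(min,max) outbound bandwidth SLA for one (customer i, class j) pair."""
    b_min: float  # guaranteed bandwidth, Bmin(i,j)
    b_max: float  # upper bound, Bmax(i,j)

    def __post_init__(self) -> None:
        if not 0 <= self.b_min <= self.b_max:
            raise ValueError("require 0 <= Bmin(i,j) <= Bmax(i,j)")

    def split(self, demand: float) -> Tuple[float, float]:
        """Return (guaranteed, best_effort) portions of a bandwidth demand.

        The best-effort portion is only an upper limit on what may be
        granted; it is delivered only if unused bandwidth is available.
        """
        guaranteed = min(demand, self.b_min)
        best_effort = min(max(demand - self.b_min, 0.0),
                          self.b_max - self.b_min)
        return guaranteed, best_effort


sla = BandwidthSLA(b_min=10.0, b_max=25.0)
print(sla.split(30.0))  # demand above Bmax is cut off at Bmax
```

Demand beyond Bmax(i,j) appears in neither portion, which mirrors the role of Bmax(i,j) as a hard upper bound.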

In the environment of Web site hosting, where the communications link(s)
between the Internet and a server farm are shared by a number of customers (i.e., traffic
to and from customer Web sites shares the communications link(s)), the bandwidth
management on the outbound link, i.e., the link from a server farm to the Internet, is
more important than the bandwidth management on the inbound link since the amount
of traffic on the outbound link is many magnitudes greater than that on the inbound
link. Furthermore, in most cases, the inbound traffic to the server farm is directly
responsible for the outbound traffic generated by the server farm. Therefore, the
constraints (Bmin(i,j) and Bmax(i,j)) imposed by a service level agreement are
typically applied to the outbound link bandwidth usage.

There are two types of bandwidth control systems that have been
proposed either in the market or in the literature. One type is exemplified by the
Access Point (AP) products from Lucent/Xedia (www.xedia.com) or by the
Speed-Class products from PhaseCom (www.speed-demon.com). These products are
self-contained units and they can be applied to regulate the outbound traffic by
dropping some outbound packets to meet the (minimum,maximum) bandwidth
SLA for each customer. The other type of bandwidth control system is exemplified by
U.S. Patent Application Serial No. [Attorney's Docket No. Y0999-374], commonly
assigned with the present invention. This system, referred to as Communications
Bandwidth Management (CBM), operates to keep the generated outbound traffic
within the SLAs by regulating the inbound traffic that is admitted to a server farm. As
with the first type of bandwidth control system exemplified by AP and Speed-Class
products, each CBM is a self-contained unit.

Bandwidth control systems of the types exemplified by Lucent AP
noted above can be applied to enforce SLAs on the outbound link usage by each
customer (and on each customer traffic type). Some of these systems are limited to
supporting the minimum bandwidth SLA while others are able to support the
(minimum,maximum) bandwidth SLA. A disadvantage with systems that enforce the
outbound bandwidth SLA by dropping packets already generated by the server farm is
that they induce undesirable performance instability. That is, when some outbound
packets must be dropped, each system drops packets randomly, thus leading to
frequent TCP (Transmission Control Protocol) retransmission, then to further
congestion and packet dropping and eventually to thrashing and slowdown. The CBM
system noted above solves such a performance instability problem by not admitting
inbound traffic whose output cannot be delivered due to exceeding the SLA. The
major problem of AP and CBM units is that their scalability is limited. A large server
farm requires more than one AP or CBM unit. However, because each of these
units is self-contained and standalone, they cannot collaborate to handle the amount of
traffic beyond the capacity of a single unit. When multiple ("n") AP or
CBM systems must be deployed to meet the capacity requirement, each unit
will handle (1/n)-th of the total bandwidth or traffic, and therefore the sharing of the
available bandwidth and borrowing of unused bandwidth among customers becomes
impossible.

From the above, it can be seen that it would be desirable if a system for
bandwidth control of a server farm were available that overcomes the scalability
problem while eliminating the performance and bandwidth sharing shortcomings of
the prior art.

SUMMARY OF THE INVENTION 

The present invention provides a highly scalable system and method
for guaranteeing and delivering (minimum,maximum) based communications link
bandwidth SLAs to customers whose applications (e.g., Web sites) are hosted by a
server farm that consists of a very large number of servers, e.g., hundreds of thousands
of servers. The system of this invention prevents any single customer (or class of)
traffic from "hogging" the entire bandwidth resource and penalizing others. The
system accomplishes this in part through a feedback system that enforces the
outbound link bandwidth SLAs by regulating the inbound traffic to a server farm. In
this manner, the system of this invention provides a method by which differentiated
services can be provided to various types of traffic, the generation of output from a
server farm is avoided if that output cannot be delivered to end users, and any given
objective function is optimized when allocating bandwidth beyond the minimums.
The system accomplishes its high scalability by allowing the deployment of more than
one very simple inbound traffic limiter (or regulator) that performs the rate-based
traffic admittance and by using a centralized rate scheduling algorithm. The system
also provides means for any external system or operator to further limit the rates used
by inbound traffic limiters.

According to industry practice, customers may have an SLA for each
type or class of traffic, whereby (minimum,maximum) bandwidth bounds are imposed
in which the minimum bandwidth represents the guaranteed bandwidth while the
maximum bandwidth represents the upper bound to the as-available use bandwidth.
The bandwidth control and management system of this invention enforces the
outbound link bandwidth SLAs by regulating (thus limiting when needed) the various
customers' inbound traffic to the server farm. Incoming traffic (e.g., Internet Protocol
(IP) packets) can be classified into various classes/types (denoted by (i,j)) by
examining the packet destination address and the TCP port number. For each class
(i,j), there is a "target" rate denoted as Rt(i,j), which is the amount of the i-th customer's
j-th type traffic that can be admitted within a given service cycle time to the server farm
which supports the i-th customer (this mechanism is known as rate-based
admittance). A centralized device is provided that computes Rt(i,j) using the history
of admitted inbound traffic to the server farm, the history of rejected (or dropped)
inbound traffic, the history of generated outbound traffic from the server farm, and the
SLAs. Each dispatcher can use any suitable algorithm to balance the work load to
servers when dispatching traffic to servers. Once Rt(i,j) values have been computed
by the centralized device, the centralized device relays the Rt(i,j) values to the one or
more elements (called inbound traffic limiters) that regulate the inbound traffic using
the rates Rt(i,j) in a given service cycle time. The above process of computing and
deploying Rt(i,j) values is repeated periodically. This period can be as often as the
service cycle time.
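
The classification of incoming packets into classes (i,j) by destination address and TCP port can be pictured with the following minimal sketch. The two lookup tables, their contents (documentation-range addresses, ports 80 and 443), and the function name `classify` are assumptions made purely for illustration; the specification does not prescribe any particular data structure.

```python
from typing import Dict, Optional, Tuple

# Hypothetical lookup tables: destination address -> customer index i,
# and TCP destination port -> traffic class index j.
CUSTOMER_BY_ADDRESS: Dict[str, int] = {"192.0.2.10": 0, "192.0.2.20": 1}
CLASS_BY_PORT: Dict[int, int] = {80: 0, 443: 1}


def classify(dst_addr: str, dst_port: int) -> Optional[Tuple[int, int]]:
    """Map a packet's destination address and TCP port to a class (i, j).

    Returns None when the packet matches no hosted customer or no known
    traffic class; how such packets are handled is left open here.
    """
    i = CUSTOMER_BY_ADDRESS.get(dst_addr)
    j = CLASS_BY_PORT.get(dst_port)
    if i is None or j is None:
        return None
    return (i, j)


print(classify("192.0.2.20", 443))
```

The resulting (i,j) pair is the key under which a limiter would look up the target rate Rt(i,j).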

In addition to enforcing the outbound link bandwidth SLAs in a highly
scalable fashion, another preferred feature of the present invention is the ability to
control the computation of Rt(i,j) via an external means such as an operator or any
server resource manager. Yet another preferred feature of the present invention is the
ability to distribute monitoring and traffic limiting functions even to each individual
server level. Any existing workload dispatching product(s) can be used with this
invention to create a large capacity dispatching network. Yet other preferred features
of the present invention are the capabilities to regulate inbound traffic to alleviate the
congestion (and thus improve the performance) of other server farm resources (in addition to the
outbound link bandwidth) such as web servers, data base and transaction servers, and
the server farm intra-infrastructure. These are achieved by providing "bounds" to the
centralized device.

Other objects and advantages of this invention will be better 
appreciated from the following detailed description. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 represents a system environment in which Internet server farm
traffic is to be controlled and managed with a bandwidth control system in accordance
with the present invention.

Figure 2 schematically represents a bandwidth control system operating
within the system environment of Figure 1 in accordance with the present invention.

Figures 3 and 4 schematically represent two embodiments for an
inbound traffic dispatching network represented in Figure 2.

Figure 5 schematically represents an inbound traffic limiter algorithm
for use with inbound traffic limiters of Figures 2 through 4 and 7.

Figure 6 schematically represents a rate scheduling algorithm for
computing Rt(i,j) with an inbound traffic scheduler unit represented in Figure 2.

Figure 7 represents an inbound traffic limiting system operating within
each server in accordance with another embodiment of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT 

Figure 1 schematically represents a system environment in which
traffic through an Internet server farm 10 can be regulated with a bandwidth control
system in accordance with the present invention. The Internet server farm 10 is
represented as being comprised of an inbound traffic (or TCP connection request)
dispatching network 12 that dispatches inbound traffic 14 to appropriate servers 16.
The invention is intended for use with a very large number of servers 16 (the size
beyond the capacity of a single dispatcher unit) of potentially different capacities that
create the outbound traffic 18 of the server farm 10. An objective of this invention is
to provide a highly scalable system and method that manages the outbound bandwidth
usage of various customers (and thus customer traffic) subject to (min,max)
bandwidth-based service level agreements (SLAs) by regulating the inbound traffic 14
of various customers. Table 1 contains a summary of symbols and notations used
throughout the following discussion.



TABLE 1

i            The i-th customer.
j            The j-th traffic type/class.
k            The k-th server.
Ra(i,j,k)    Inbound traffic of the j-th type of the i-th customer that has been
             admitted at the k-th server.
Ra(i,j)      The total inbound traffic of the j-th type of the i-th customer that
             has been admitted; equivalent to the sum of Ra(i,j,k) over all k.
Rr(i,j,k)    Inbound traffic of the j-th type of the i-th customer that has been
             rejected at the k-th server.
Rr(i,j)      The total inbound traffic of the j-th type of the i-th customer that
             has been rejected; equivalent to the sum of Rr(i,j,k) over all k.
Rt(i,j,k)    The allowable (target) traffic rate for the i-th customer's j-th type
             traffic at the k-th server.
Rt(i,j)      The total allowable traffic rate for the i-th customer's j-th type
             traffic, equivalent to the sum of Rt(i,j,k) over all k.
B(i,j,k)     The i-th customer's j-th type outbound traffic from the k-th server.
B(i,j)       The total of the i-th customer's j-th type outbound traffic,
             equivalent to the sum of B(i,j,k) over all k.
b(i,j)       The expected bandwidth usage by a unit of inbound traffic type
             (i,j).
C(i,j,k)     The server resource (processing capacity) that is allocated to the
             i-th customer's j-th type traffic at the k-th server.
C(i,j)       The total processing capacity that is allocated to the i-th
             customer's j-th type traffic, equivalent to the sum of C(i,j,k) over
             all k.
c(i,j)       The expected server resource usage by a unit of (i,j) traffic.
Bmin(i,j)    The guaranteed outbound bandwidth usage on the i-th customer's
             j-th type traffic.
Bmax(i,j)    The maximum on the outbound bandwidth usage on the i-th
             customer's j-th type traffic.
Btotal       The total usable bandwidth available for allocation.
Rbound(i,j)  An optional bound on Rt(i,j) that may be set manually or by any
             other resource manager of a server farm.



Figure 2 schematically illustrates an embodiment of the invention
operating within the system environment shown in Figure 1. A unit referred to herein
as the inbound traffic scheduler (ITS) unit 20 is employed to observe the amount of
incoming traffic 14 that consists of the amount of admitted inbound traffic and the
amount of rejected traffic. The inbound traffic dispatching network 12 monitors both
admitted and rejected traffic amounts. The ITS unit 20 also observes outbound traffic
18. The ITS unit 20 then computes the expected amount of outbound traffic that
would be generated when one unit of traffic is admitted to a server 16, computes the
inbound traffic target rates, and communicates the rates to an inbound traffic limiter (ITL)
22. The ITL 22 then regulates the arriving inbound traffic 14 by imposing target rates
at which inbound traffic 14 is admitted. Each of these functions is performed for the
i-th customer's j-th class traffic within a service cycle time, which is a unit of time or
period that is repeated. Optionally observed by the ITS unit 20 is the average resource
usage c(i,j) by a unit of type (i,j) inbound traffic 14.

As indicated in Table 1, Ra(i,j) denotes the amount of inbound traffic
14 admitted and Rr(i,j) denotes the amount of inbound traffic 14 rejected. Both are
obtained by the ITS unit 20 during a service cycle time (thus representing a rate) for
the i-th customer's j-th class traffic. Rr(i,j) greater than zero implies that the Rr(i,j)
amount of traffic was rejected due to inbound traffic 14 exceeding the usage of the
agreed upon outbound bandwidth. Rt(i,j) denotes the allowable (thus targeted) portion
of inbound traffic 14 within a service cycle time for the i-th customer's j-th class traffic.
Here, Ra(i,j) is smaller than or equal to Rt(i,j) as a result of the operation of the ITL
22. B(i,j) denotes the total amount of outbound traffic 18 generated for the i-th
customer's j-th class traffic within a service cycle time, and c(i,j) denotes the average
resource usage by a unit of type (i,j) inbound traffic 14. An example of c(i,j) is the
CPU cycles required to process one (i,j) type request. Finally, Rbound(i,j) denotes
the absolute bound on Rt(i,j) when the ITS 20 computes new Rt(i,j).



In accordance with the above, the following operations will be
completed during a service cycle time:

(a) The ITS unit 20 collects Ra(i,j), Rr(i,j), B(i,j) and optionally c(i,j), and
computes b(i,j), the expected amount of output that would be generated when one unit
of traffic type (i,j) is processed by a server 16. The ITS unit 20 also collects
Rbound(i,j) when available.

(b) The ITS unit 20 runs a rate scheduling (or freight load scheduling)
algorithm to determine the best target values for Rt(i,j). The ITS unit 20 may then
compute Rt(i,j,k) if needed for each server 16. The ITS unit 20 then relays Rt(i,j)
values to one or more inbound traffic limiters (ITL) 22.

(c) The ITL 22 admits inbound traffic 14 at the rate Rt(i,j) in each service
cycle time.

The inbound traffic dispatching network 12 has an inbound traffic
monitor (ITM) 24 that observes the admitted traffic rates Ra(i,j) and the rejected
traffic rates Rr(i,j), and relays these rates to the ITS unit 20. Within the inbound
traffic dispatching network 12, there could be more than one inbound traffic limiter
(ITL) 22 and more than one inbound traffic monitor (ITM) 24. Although the inbound
traffic monitor (ITM) 24 and inbound traffic limiter (ITL) 22 functions are shown and
described as being associated with the inbound traffic dispatching network 12, these
functions could be completely distributed to each individual server 16, as will be
discussed below. Since the ITL 22 regulates the inbound traffic, it is convenient to
put the inbound traffic monitoring functions at the ITL 22.

As also shown in Figure 2, each server 16 may have a resource usage
monitor (RUM) 26 that observes server resource usage, c(i,j), and an outbound traffic
monitor (OTM) 28 that observes the outbound traffic, B(i,j), both of which are relayed
to the ITS unit 20. There are a number of ways to observe the outbound traffic 18,
B(i,j), any of which would be suitable for purposes of the present invention. The
ITS unit 20 collects Ra(i,j), Rr(i,j), B(i,j) and optionally Rbound(i,j) and c(i,j), and
then computes the optimum values for Rt(i,j) that meet the service level agreements
(SLAs) and relays these values to one or more ITLs 22. As represented in Figure 2, a
server resource manager 21 is an optional means and its responsibility is to provide
the absolute bound Rbound(i,j) on the rate Rt(i,j) regardless of the Bmax(i,j) given in
the (min,max) SLAs.

Figures 3 and 4 schematically represent how the inbound traffic
dispatching network 12 can be rendered highly scalable (large capacity) using existing
dispatchers and a high-speed LAN (HS-LAN). In Figure 3, the inbound traffic
limiting function and the inbound traffic monitoring function of the ITL 22 and ITM
24, respectively, are assigned to a standalone ITL unit 30, while in Figure 4 the
inbound traffic limiting function and the inbound traffic monitoring function are
assigned to each of a number of dispatchers 42, 44 and 46. With reference to Figure
3, the ITL unit 30 is connected to dispatchers 32, 34 and 36 via a high-speed LAN
(HS-LAN) 31. The primary responsibility of the ITL unit 30 is to limit (thus dropping
when needed) the inbound traffic (i,j) 14 by applying the target rates Rt(i,j) given by
the ITS unit 20. While doing so, ITL unit 30 also monitors both admitted traffic
Ra(i,j) and rejected traffic Rr(i,j). Each dispatcher 32, 34 and 36 is responsible for
dispatching (or load balancing) received traffic to associated servers 16 using any of
its own load balancing algorithms. The traffic admittance algorithm used by the ITL
22 associated with the unit 30 for rate-based admittance is referred to as the rate-based
inbound traffic regulation algorithm. While only one ITL unit 30 is represented in
Figure 3, additional ITLs can be added to the high-speed LAN (HS-LAN) 31 to
achieve even higher capacity, thus achieving higher scalability.



The inbound traffic dispatching network 12 of Figure 4 is structured
similarly to that of Figure 3, with a difference being that the inbound traffic limiting
function and the inbound traffic monitoring function are assigned to each dispatcher
42, 44 and 46. Inbound traffic 14 is sent to dispatchers 42, 44 and 46 via a
high-speed LAN (HS-LAN) 31. The dispatchers 42, 44 and 46 with the ITL functionality
are responsible for regulating the inbound traffic 14 prior to dispatching traffic to the
servers 16. In this embodiment, both ITL and ITM functionalities become
functionalities added to any existing dispatcher (or load balancing) units.

Figure 5 schematically represents the rate-based inbound traffic
regulation algorithm executed by each ITL 22. This algorithm is repeated with each
service cycle. Step 53 checks if the cycle-time has expired or not. If not expired, the
algorithm moves to step 55. When the cycle-time has expired, the algorithm executes
step 54, gets a new set of Rt(i,j) values if available, resets any other control and
counter variables, and resets both Ra(i,j) and Rr(i,j) to zero for all i and j. Step 55
determines to which customer and traffic type (i,j) the TCP connection
request packet received in step 50 belongs so that a proper rate Rt(i,j) can be applied. In step
56, the algorithm checks whether or not the received TCP connection request packet
of type (i,j) can be admitted by comparing Ra(i,j) against Rt(i,j). In step 56, if Ra(i,j)
is less than Rt(i,j), the received TCP connection request packet is admitted by
executing step 57. Step 57 increments Ra(i,j) by one and admits the packet. In step
56, if Ra(i,j) has reached Rt(i,j), step 58 is executed. Step 58 increments Rr(i,j) by
one and rejects (or drops) the received TCP connection request packet. Both steps 57
and 58 lead to step 50. Step 50 gets a packet from inbound traffic 14. Step 51 checks
whether or not the received packet is a TCP connection request. If not, the packet is
simply admitted. If yes, step 53 is executed.
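
The per-packet flow of Figure 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the class name `InboundTrafficLimiter`, its method names, the explicit `now` timestamp argument, and the choice to treat a class with no configured target as having a target of zero are all hypothetical. Classification (step 55) is assumed to have been done by the caller, which passes the (i,j) key.

```python
from collections import defaultdict
from typing import Dict, Hashable, Optional


class InboundTrafficLimiter:
    """Rate-based admission of TCP connection requests per class (i,j)."""

    def __init__(self, targets: Dict[Hashable, int], cycle_time: float,
                 now: float = 0.0) -> None:
        self.targets = dict(targets)          # Rt(i,j) for the current cycle
        self.pending: Optional[Dict[Hashable, int]] = None  # next Rt(i,j)
        self.cycle_time = cycle_time
        self.cycle_start = now
        self.admitted = defaultdict(int)      # Ra(i,j)
        self.rejected = defaultdict(int)      # Rr(i,j)

    def update_targets(self, targets: Dict[Hashable, int]) -> None:
        """Store new Rt(i,j) values; they take effect at the next cycle."""
        self.pending = dict(targets)

    def handle(self, key: Hashable, is_connection_request: bool,
               now: float) -> bool:
        """Return True to admit the packet, False to reject (drop) it."""
        if not is_connection_request:                      # step 51
            return True
        if now - self.cycle_start >= self.cycle_time:      # step 53
            if self.pending is not None:                   # step 54
                self.targets, self.pending = self.pending, None
            self.admitted.clear()
            self.rejected.clear()
            self.cycle_start = now
        # Step 56. Assumption: an unknown class has a target of zero.
        if self.admitted[key] < self.targets.get(key, 0):
            self.admitted[key] += 1                        # step 57
            return True
        self.rejected[key] += 1                            # step 58
        return False


itl = InboundTrafficLimiter({(0, 0): 2}, cycle_time=1.0)
print([itl.handle((0, 0), True, 0.1 * n) for n in range(4)])
```

Only connection requests are counted, so packets belonging to already admitted connections pass through untouched, as in step 51.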






Figure 6 schematically represents an algorithm referred to above as the
rate scheduling algorithm, which is executed by the ITS unit 20 to determine the
optimum values for Rt(i,j) for all i and j. This scheduling algorithm starts at step 1
(61), which examines whether or not the service level agreements (SLAs) are all
satisfied. Step 1 computes b(i,j) using the formula:

b(i,j) = a (B(i,j) / Ra(i,j)) + (1 - a) b(i,j)

where b(i,j) is the expected bandwidth usage per unit of inbound traffic 14 (the b(i,j)
on the right-hand side being the previous estimate), Ra(i,j) is
the admitted inbound traffic, B(i,j) is the observed total outbound traffic of the i-th
customer's j-th type, and a is a value between 0 and 1.
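
The update of b(i,j) is an exponentially weighted moving average of the observed outbound traffic per admitted unit. A minimal sketch follows; the function name `update_b`, the default weight of 0.5, and the decision to keep the previous estimate when nothing was admitted in the cycle (which avoids a division by zero) are assumptions not stated in the text.

```python
def update_b(b_prev: float, outbound: float, admitted: float,
             a: float = 0.5) -> float:
    """Return the new b(i,j) = a * (B(i,j) / Ra(i,j)) + (1 - a) * b_prev.

    b_prev   -- previous estimate of b(i,j)
    outbound -- B(i,j) observed in the last service cycle
    admitted -- Ra(i,j) observed in the last service cycle
    a        -- smoothing weight between 0 and 1
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must lie between 0 and 1")
    if admitted <= 0:
        # Assumption: with no admitted traffic there is no new sample.
        return b_prev
    return a * (outbound / admitted) + (1.0 - a) * b_prev


# 20 admitted requests produced 300 units of output; prior estimate was 10.
print(update_b(10.0, 300.0, 20, a=0.5))  # 0.5 * 15 + 0.5 * 10 = 12.5
```

A value of a close to 1 makes the estimate track the most recent cycle, while a value close to 0 smooths over many cycles.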

Step 1 adjusts Bmax(i,j) by choosing the minimum of Bmax(i,j) itself
and an optionally given bound Rbound(i,j)*b(i,j). Here Rbound(i,j) is an "absolute
bound" on Rt(i,j). Since Bmin(i,j) must be less than or equal to Bmax(i,j), this
adjustment may affect the value of Bmin(i,j) as well. Step 1 then computes Bt(i,j)
and Bt and checks whether or not the generated outbound traffic is currently
exceeding the total usable bandwidth Btotal (that is, it detects outbound link
congestion). If congestion on the outbound link has been detected, step 2 (62) is
executed. If no congestion was detected, no packets were dropped (Rr(i,j) = 0),
and no SLA has been violated, the algorithm moves to step 5 (65) and stops.
Otherwise, step 1 moves to step 2 (62).
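
The bound adjustment and congestion check of step 1 might look like the sketch below. The function names are hypothetical, and the congestion test is read here as "the sum of observed B(i,j) exceeds Btotal", which is an interpretation of the text rather than a formula it states.

```python
from typing import Dict, Hashable, Optional, Tuple


def adjust_bounds(b_min: float, b_max: float, b: float,
                  r_bound: Optional[float] = None) -> Tuple[float, float]:
    """Apply the optional absolute bound Rbound(i,j) to the SLA pair.

    Bmax(i,j) becomes min(Bmax(i,j), Rbound(i,j) * b(i,j)); Bmin(i,j) is
    then lowered if necessary so that Bmin(i,j) <= Bmax(i,j) still holds.
    """
    if r_bound is not None:
        b_max = min(b_max, r_bound * b)
    b_min = min(b_min, b_max)
    return b_min, b_max


def link_congested(outbound: Dict[Hashable, float], b_total: float) -> bool:
    """Assumed test: total observed outbound traffic exceeds Btotal."""
    return sum(outbound.values()) > b_total


print(adjust_bounds(10.0, 40.0, b=2.0, r_bound=15))  # Bmax capped at 30
print(link_congested({(0, 0): 60.0, (1, 0): 50.0}, b_total=100.0))
```

When Rbound(i,j) is small enough, the guaranteed minimum itself is reduced, which is the effect on Bmin(i,j) that the paragraph mentions.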

Step 2 (62) first computes the bandwidth requirement Bt(i,j) that would have
arisen had no packets been dropped, that is, had the total inbound traffic
(Ra(i,j) + Rr(i,j)) for all (i,j) been admitted. This bandwidth requirement Bt(i,j)
may not exceed Bmax(i,j) and thus it is bounded by Bmax(i,j). Step 2 then checks if
the bandwidth requirements Bt(i,j) for all (i,j) can be supported without congesting
the outbound link. If so, step 2
moves to step 4 (64) to convert the targeted bandwidth requirement to the targeted
rates. If step 2 detects a possible congestion (Bt > Btotal), it then moves to step 3 (63)
to adjust those Bt(i,j) computed in step 2 (62) so that the link level congestion can
be avoided while guaranteeing the minimum bandwidth Bmin(i,j) for every (i,j).
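
Step 2 can be illustrated with the short sketch below. It assumes that the unthrottled requirement is the offered load (Ra(i,j) + Rr(i,j)) multiplied by the per-unit estimate b(i,j); the text implies this product but does not write it out. The function names and sample numbers are hypothetical.

```python
from typing import Iterable


def bandwidth_requirement(admitted: float, rejected: float,
                          b: float, b_max: float) -> float:
    """Bt(i,j): bandwidth needed had nothing been dropped, capped at Bmax.

    Assumption: requirement = (Ra(i,j) + Rr(i,j)) * b(i,j).
    """
    return min((admitted + rejected) * b, b_max)


def possible_congestion(requirements: Iterable[float],
                        b_total: float) -> bool:
    """True when the summed requirements Bt exceed Btotal (go to step 3)."""
    return sum(requirements) > b_total


reqs = [bandwidth_requirement(20, 10, 2.0, 50.0),   # 60 capped to 50
        bandwidth_requirement(30, 0, 2.0, 80.0)]    # 60, below its cap
print(reqs, possible_congestion(reqs, b_total=100.0))
```

In this example the two classes together ask for 110 units against a Btotal of 100, so the algorithm would proceed to step 3.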
In step 3, two options are described: a first allows "bandwidth
borrowing" among customers, while in the second, bandwidth borrowing among
customers is not allowed. Here, "bandwidth borrowing" means letting some
customers use the portion of the minimum guaranteed bandwidth not used by other
customers. Step 3 first computes the "shareable" bandwidth. Step 3 then allocates (or
prorates) the shareable bandwidth among those customer traffic classes that are
demanding more than the guaranteed bandwidth Bmin(i,j). Although step 3 describes
the use of "fair prorating of shareable bandwidth", this allocation discipline can be
replaced by any other allocation discipline such as "weighted priority" or "weighted
cost".
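
The text does not give formulas for step 3, so the following is only one plausible reading, offered as a sketch. It assumes that each class first receives the smaller of its demand Bt(i,j) and its minimum Bmin(i,j), that with borrowing the shareable bandwidth is Btotal minus what is actually used of the minimums, that without borrowing it is Btotal minus the full sum of the minimums, and that "fair prorating" means sharing in proportion to each class's excess demand. The function name `prorate` is hypothetical.

```python
from typing import Dict, Hashable


def prorate(demands: Dict[Hashable, float], minimums: Dict[Hashable, float],
            b_total: float, allow_borrowing: bool = True
            ) -> Dict[Hashable, float]:
    """Return adjusted bandwidth targets Bt(i,j) that fit within Btotal."""
    # Guaranteed part: a class never gets less than min(demand, Bmin).
    base = {k: min(demands[k], minimums[k]) for k in demands}
    if allow_borrowing:
        reserved = sum(base.values())        # unused minimums are released
    else:
        reserved = sum(minimums[k] for k in demands)  # minimums stay reserved
    shareable = max(b_total - reserved, 0.0)
    # Excess demand above the guaranteed part, per class.
    excess = {k: demands[k] - base[k] for k in demands}
    total_excess = sum(excess.values())
    if total_excess <= 0:
        return base
    scale = min(shareable / total_excess, 1.0)
    return {k: base[k] + scale * excess[k] for k in demands}


demands = {"a": 80.0, "b": 40.0, "c": 5.0}
minimums = {"a": 20.0, "b": 20.0, "c": 20.0}
print(prorate(demands, minimums, b_total=100.0, allow_borrowing=True))
print(prorate(demands, minimums, b_total=100.0, allow_borrowing=False))
```

With borrowing, the 15 units of guaranteed bandwidth that class "c" leaves unused are redistributed to "a" and "b"; without borrowing they stay idle, so the total allocated is lower.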

In step 4 (64), the bandwidth use targets Bt(i,j) computed in step 3 are
converted to the target inbound traffic rates Rt(i,j). When Bt(i,j) is less than or equal
to the guaranteed minimum Bmin(i,j), there should be no "throttling" of the inbound
traffic. Therefore, Bt(i,j) is set to Bmax(i,j) for such (i,j) prior to converting Bt(i,j) to
Rt(i,j). In step 4, if the target rates are used by servers (as described below with
reference to Figure 7), Rt(i,j,k) must be computed from Rt(i,j) to balance the response
time given by various servers 16 for each pair of (i,j) among all k. Doing so is equivalent to
making the residual capacity or resource of all servers 16 equal, expressed by:

C(i,j,1) - Rt(i,j,1) c(i,j) = C(i,j,2) - Rt(i,j,2) c(i,j) = ... = C(i,j,n) - Rt(i,j,n) c(i,j) = d

where C(i,j,k) is the total resource allocated at server k for handling the traffic class
(i,j), c(i,j) is the expected resource usage by a unit of (i,j) traffic and d is a derived
value. Since

Rt(i,j) = SUM of Rt(i,j,k) for all k = SUM of (C(i,j,k) - d) / c(i,j) for all k

one can derive d from the above formula. Assuming a total of n servers:

d = (C(i,j) - Rt(i,j) c(i,j)) / n

where C(i,j) is the sum of C(i,j,k) over all k, and the formula for deriving Rt(i,j,k)
from Rt(i,j) is

Rt(i,j,k) = (C(i,j,k) - (C(i,j) - Rt(i,j) c(i,j)) / n) / c(i,j)

Step 4 (64) leads to step 5 (65) and the rate scheduling algorithm stops.
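
A worked sketch of step 4 follows. The conversion Rt(i,j) = Bt(i,j) / b(i,j) is inferred from the definition of b(i,j) as bandwidth per unit of inbound traffic and is an assumption; the per-server split implements the formula for Rt(i,j,k) given above. The function names and sample numbers are hypothetical.

```python
from typing import List


def target_rate(bt: float, b_min: float, b_max: float, b: float) -> float:
    """Convert a bandwidth target Bt(i,j) into an inbound rate Rt(i,j).

    Classes at or below their guaranteed minimum are not throttled, so
    their target is raised to Bmax(i,j) first. Assumes b(i,j) > 0.
    """
    if bt <= b_min:
        bt = b_max
    return bt / b


def per_server_rates(rt: float, capacities: List[float],
                     c: float) -> List[float]:
    """Split Rt(i,j) into Rt(i,j,k) so residual capacity d is equal.

    capacities -- C(i,j,k) for k = 1..n
    c          -- c(i,j), expected resource usage per unit of traffic
    """
    n = len(capacities)
    d = (sum(capacities) - rt * c) / n       # common residual capacity
    return [(cap - d) / c for cap in capacities]


rates = per_server_rates(rt=30.0, capacities=[100.0, 60.0, 80.0], c=2.0)
print(rates, sum(rates))  # residual d = 60 on every server; rates sum to 30
```

Note that the formula can yield a negative Rt(i,j,k) for a server whose allocated capacity is below d; the text does not address this case, so a practical implementation would presumably need to clamp such values and redistribute the difference.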

Finally, Figure 7 represents a system in which the inbound traffic
monitoring function (ITM) 70, inbound traffic limiting function (ITL) 72 and
outbound traffic monitoring function (OTM) 74 are distributed to each server 16.
Also distributed to each server 16 is resource use monitoring function (RUM) 76.
This system makes the inbound traffic dispatching network 12 in Figure 7 extremely
simple. The inbound traffic dispatching network 12 of Figure 7 is very much like the
one illustrated in Figure 4 except the dispatchers 42, 44 and 46 are simply replaced by
dispatchers 32, 34 and 36. In this case, the ITS 20 executes the rate scheduling
algorithm and derives Rt(i,j,k) from Rt(i,j) for every k. As in the case of the ITS 20 in
Figure 2, the ITS 20 in Figure 7 gets Rbound(i,j) from any server resource manager
21. The ITS 20 uses c(i,j), the average of c(i,j,k) over k, in the derivation of Rt(i,j,k)
from Rt(i,j). The values c(i,j,k) are observed by the resource utilization monitoring function
(RUM) 76 that resides in each server 16. Furthermore, each ITL 72 executes the
rate-based inbound traffic regulation algorithm for (i,j,k) in place of (i,j) described in
reference to Figure 5. The ITS 20 relays Rt(i,j,k) values to each server k.



While the invention has been described in terms of a preferred 
embodiment, it is apparent that other forms could be adopted by one skilled in the art. 
Accordingly, the scope of the invention is to be limited only by the following claims. 



